Cross-component Clustering for Template Induction
Abstract
We propose an unsupervised approach to template induction for information extraction that detects sub-topics and themes cutting across the documents of a topical corpus. We introduce a new method, cross-component clustering, that simultaneously clusters the components forming our setting, each of which consists of the words of a single article. Our algorithm is derived from the Information Bottleneck clustering algorithm. The resulting clusters correspond systematically to the sets of terms used to fill the slots of the ready-made MUC3/4 template, which was used for evaluation.
Similar resources
Using Clustering and Factor Analysis in Cross Section Analysis Based on Economic-Environment Factors
Homogeneity of groups is important in studies that use cross-section and multi-level data. Most studies in economics, especially panel data analyses, need some kind of homogeneity to ensure the validity of their results. This paper presents methods for clustering and homogenizing groups in cross-section studies based on enviro-economic components. For this, a sample of 92 countries which...
Synthesis and Evaluating of Nanoporous Molecularly Imprinted Polymers for Extraction of Quercetin as a Bioactive Component of Medicinal Plants
In this work, the template, monomer, and cross-linker with the ratio of 1:8:40 were used to synthesize Molecularly Imprinted Polymers (MIPs) for extraction of the bioactive chemical compounds from some traditional herbs as a sorbent material. Quercetin, Methacrylic Acid (MAA), Trimethylolpropanetrimethacrylate (TRIM) and Tetrahydrofuran (THF) were used as a template, funct...
Unmixed Spectrum Clustering for Template Composition in Lung Sound Classification
In this paper, we propose a method for composing templates of lung sound classification. First, we obtain a sequence of power spectra by FFT for each given lung sound and compute a small number of component spectra by ICA for each of the overlapping sets of tens of consecutive power spectra. Second, we put component spectra obtained from various lung sounds into a single set and conduct cluster...
Template Matching using Statistical Model and Parametric Template for Multi-Template
This paper presents a template matching method that uses a statistical model and a parametric template for multi-template matching. The algorithm consists of two phases: training and matching. In the training phase, the statistical model created by principal component analysis (PCA) can be used to synthesize the multi-template. The advantage of PCA is that it reduces the variances of the multi-template. In the matc...
Evaluation of Similarity Measures for Template Matching
Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...
Publication date: 2002